Abstract. The Linear Search Problem is studied from the viewpoint of Hamiltonian
dynamics. For the specific, yet representative case of exponentially distributed position 
of the hidden object, it is shown that the optimal orbit follows an unstable separatrix in 
the associated Hamiltonian system. 



1. Introduction 

The Linear Search Problem has a venerable history, going back to R. Bellman ('63) and 
A. Beck ('64). They looked into the following question: 

An object is placed at a point H on the real line, according to a known
probability distribution. A search plan (or trajectory) is a sequence $\mathbf{x} = \{x_i\}_{i=1}^{\infty}$
with $\ldots < -x_4 < -x_2 < 0 < x_1 < x_3 < \ldots$ (or $\ldots < -x_3 < -x_1 < 0 < x_2 < x_4 < \ldots$).
A search is performed by a searcher walking alternately to
the points of the search plan, starting at 0, until the point H is found.

The total distance traveled till the point is found is L(x, H), and the cost of the search 
plan x is given by 

E(x) = E[L(x,H)]. 

The task is to find the plan x minimizing E(x). We are therefore in the average case 
analysis situation. 

The search problem has also been studied in theoretical computer science, see e.g. [14],
where it was called the cow-path problem. There have been many interesting generalizations
such as search on rays, rendezvous, search with turn cost etc. [8] 1 101 IT]. Finally, there is 
some recent work in connection with robotics, see e.g. [13] . 

1.1. Background on Linear Search Problem. The Linear Search Problem was studied
mostly by Anatole Beck and his coauthors in a series of papers where they analyzed in
great detail the archetypal case of normally distributed H (see [HI El HI [12] ). It turned
out that the candidates for optimal trajectories form a one-parameter family (parameterized
by the length of the first excursion $|x_1|$). Using careful analysis, Beck further reduced the
choice of candidates to just two initial points, of which one turned out to be the best
according to numerics. On the nature of these initial points, [7] stated:

...we opine that this is a question whose answer will not shed much
mathematical light.
This note aims at uncovering the underlying geometric structure of the Linear Search
Problem. Specifically, we argue that the correct framework here is that of Hamiltonian 
dynamics, especially where hyperbolicity of the underlying dynamics can be deployed. In 
our geometric picture the mysterious two points naturally appear at the intersection of a 
separatrix (that is present in the associated Hamiltonian system) with the curve of initial 
turning points. 

To this end we analyze in detail a one-sided version of the Linear Search Problem which 
we describe next. The original problem considered by Beck is addressed from the same 
viewpoint in the appendix. 

We restrict our proofs mostly to the exponentially distributed position H: this is done
primarily to keep the presentation succinct and clear. In the appendix we demonstrate
that our approach, with small modifications, works for some other distributions, e.g. for
the one-sided Gaussian. We believe that even more general classes of distributions can also be
analyzed; this will be done in a follow-up paper.

1.2. Half-line problem. We concentrate here on a one-sided gatherer version of the search
problem. Here, the hidden object H is located on the half-line $\mathbb{R}_+$, according to some
(known) probability distribution. One searches for H according to the plan

$$\mathbf{x} = \{0 = x_0 < x_1 < x_2 < \cdots < x_k < \cdots\},$$

and stops after the step $n = n(\mathbf{x}, H)$ iff the point $H \in (x_{n-1}, x_n]$. One can think of a
gatherer who mindlessly collects anything on the way, bringing the loot to the origin, where
the results are analyzed (in contrast to the searcher, who stops as soon as the sought-after
object is found).

As in the original version, one needs to minimize the average cost of the search, which 
in our case is given by 

$$E(\mathbf{x}) = E[L(\mathbf{x}, H)] = E\left[\sum_{k=1}^{n(\mathbf{x},H)} x_k\right]. \qquad (1)$$

Figure 1. Linear Search Problem: two-sided searcher on the left, one-sided 
gatherer on the right. The cost of the indicated plans given the positions of 
the hidden objects are shown by darker shade. 



1.3. Motivation. One-sided linear search appears naturally in quite a few applications. 
The initial motivation was the problem of search in unstructured Peer-to-Peer storage 
systems, analyzed in j^, where the relevance of Hamiltonian dynamics was first noticed. 
In such an unstructured network, one is sequentially flooding some (hop-) vicinity of a node
(see Figure 2) with a request for an item, setting the Time-to-Live at some limit, until the
item is found. The cost of a plan is the total number of queries at all nodes of the network,
representing the per-query overhead.

Figure 2. Search for an object in an unstructured Peer-to-Peer network. The
object is found after the 3rd flooding.

Further applications include robotic search, where one deals with programming a robot of 
low sensing and computational capabilities, unable to recognize the objects it collects. Also 
the problem of efficient eradication of unwanted phenomena (say irradiation of a tumor) 
can be mapped onto our model. 

1.4. Outline of the results. We start with the general discussion of the one-sided search 
problem, showing in particular that the natural necessary condition of optimality implies 
that the optimal plan should satisfy a three-term recurrence, the variational recursion (a 
discrete analogue of the Euler-Lagrange equations). This reduces the dimension of phase 
space, but also introduces Hamiltonian dynamics. 

We analyze in detail a "self-similar" case of a homogeneous tail distribution function, also
called a Pareto distribution, and see that the phase space is split naturally into a chaotic
region and a monotonicity region, divided by a separatrix.

Hamiltonian dynamics associated to the variational recursion is then studied. We set 
up the stage for a general distribution, but mostly constrain our proofs to the case of the 
exponentially distributed position H of the object, i.e. to the case of 

$$f(x) := \mathbb{P}(H > x) = \exp(-x).$$

We prove that the optimal trajectories should start at the separatrix.^1 On the other hand,
the plans satisfying the variational recursion are represented by a one-dimensional curve. The
intersection of the separatrix with the curve gives two candidates for the starting position,
mirroring the situation in the original setting of Beck et al.'s papers. We conclude with
several open questions.

^1 This connection between energy-minimizing orbits and invariant sets is reminiscent of the Aubry-Mather
theory [3]. There, energy minimization is used to prove existence of the so-called Aubry-Mather sets. Here
we proceed in the other direction: we establish an invariant set in order to find minimal "energy" orbits.

Occasionally, we use several standard notions from the theory of dynamical systems; for
definitions we refer to [15].

2. Basic properties 

2.1. Basic notions. The input into the search algorithm is a plan, or a trajectory 

$$\mathbf{x} = \{x_0 = 0, x_1, \ldots, x_k, \ldots\}, \qquad x_k > 0, \quad x_k \to \infty,$$

that is, an unbounded sequence of turning points. Below we list some simple properties of
the cost functional (1):

Proposition 1. The cost of a plan is given by 

$$E(\mathbf{x}) = \sum_{k=1}^{\infty} x_k \mathbb{P}(H > x_{k-1}) = \sum_{k=1}^{\infty} x_k f(x_{k-1}). \qquad (2)$$

Any optimal search plan is strictly monotone. In other words, if a plan $\mathbf{x} = \{x_n\}$ is not
strictly increasing, there is a naturally modified strictly monotone plan $\tilde{\mathbf{x}} = \{\tilde{x}_n\}$ such that
$E(\tilde{\mathbf{x}}) < E(\mathbf{x})$.

Proof. The contribution to the average cost is the length of excursion times the probability 
that such excursion will have to occur: 

$$L(\mathbf{x}, H) = \sum_k x_k \cdot \mathbf{1}(H > x_{k-1}),$$

which, after taking expectations, implies (2).

Now, assume that a plan $\mathbf{x}$ is not strictly monotone. Consider a modified plan $\tilde{\mathbf{x}}$, where
the turning points preventing strict monotonicity are removed. Then, as can be verified
by straightforward estimates, $E(\tilde{\mathbf{x}}) < E(\mathbf{x})$. □
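The two descriptions of the cost in Proposition 1 (the distance charged for a given position of the object, and its expectation) can be sketched in a few lines. This is only an illustration: the function names `realized_cost` and `expected_cost` are hypothetical, and the plan is truncated to finitely many turning points.

```python
import math

def realized_cost(plan, h):
    """Cost charged to the gatherer when the object sits at h.

    `plan` lists the turning points x_1 < x_2 < ...; the gatherer pays x_k for
    every excursion up to and including the first one with h <= x_k.
    """
    total = 0.0
    for x in plan:
        total += x
        if h <= x:
            return total
    raise ValueError("the truncated plan never reaches h")

def expected_cost(plan, tail):
    """Formula (2) on a finite prefix: sum of x_k * f(x_{k-1}), with x_0 = 0."""
    prev, total = 0.0, 0.0
    for x in plan:
        total += x * tail(prev)
        prev = x
    return total
```

For example, with the plan 1, 2, 4, 8 and the object at h = 3, the gatherer makes three excursions and pays 1 + 2 + 4.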

Proposition 2. If the position of the object is known, then the cost of its recovery,
$L := E[H]$, is a lower bound on the cost of any trajectory:

$$E(\mathbf{x}) \geq L.$$

There exists a plan of cost at most $4L + \epsilon$ (thus finite if L is).
Proof. First, note that the sum

$$\sum_{k=0}^{\infty} x_{k+1} f(x_k)$$

is bounded below by the integral

$$\int_0^{\infty} f(x)\,dx = L.$$

Next, observe that

$$L = E[H] = -\int_0^{\infty} x f'(x)\,dx = \int_0^{\infty} f(x)\,dx,$$

by definition and using integration by parts once.

Then, using monotonicity of f, we estimate this integral from below:

$$L = \int_0^{\infty} f(x)\,dx = \sum_{k=0}^{\infty} \int_{x_k}^{x_{k+1}} f(x)\,dx \geq \sum_{k=0}^{\infty} (x_{k+1} - x_k) f(x_{k+1}).$$

Evaluating the expression on the right over the geometric sequence $x_0 = 0$, $x_k = A \cdot 2^{k-1}$
($k = 1, 2, \ldots$), we have

$$4L \geq \sum_{k=0}^{\infty} x_{k+2} f(x_{k+1}).$$

Adding $x_1 = A$ to both sides, we obtain

$$4L + A \geq E(\mathbf{x}),$$

which proves the claim since A can be taken arbitrarily small. □
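The bound of Proposition 2 is easy to probe numerically. The sketch below assumes the exponential tail f(x) = exp(-x) (so that L = 1) and the doubling plan x_k = A * 2^(k-1); the helper name `geometric_plan_cost` and the truncation after 200 excursions are choices made for this illustration only.

```python
import math

def geometric_plan_cost(A, tail, terms=200):
    """Cost (2) of the plan x_0 = 0, x_k = A * 2**(k-1), truncated after `terms` excursions."""
    cost, prev = 0.0, 0.0
    for k in range(1, terms + 1):
        x = A * 2.0 ** (k - 1)
        cost += x * tail(prev)   # excursion length times probability it is needed
        prev = x
    return cost

exp_tail = lambda x: math.exp(-x)   # assumed tail; its mean is L = 1
L = 1.0
costs = {A: geometric_plan_cost(A, exp_tail) for A in (0.01, 0.1, 1.0)}
```

Each computed cost should land between L and 4L + A, in agreement with the two inequalities of the proposition.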

If the tail distribution function is continuously differentiable (or even Lipschitz) on $[0, \infty)$,
then the optimal trajectory does exist. In particular, one need not consider bi-infinite
trajectories $\{0 < \ldots < x_{-2} < x_{-1} < x_1 < x_2 < \cdots\}$. This is an extension of the
corresponding result for the two-sided search, see e.g. [6]. However, for completeness,
we give an independent proof in the next section. The Lipschitz property is essential, as
was also observed by Beck and Franck, since one can construct an example for which no
sequence with finitely many terms near zero is optimal. In other words, there is no first
turning point; see the example in the next section.

2.2. Variational recursion. Optimality of a sequence implies a local condition. 

Proposition 3. Assume the tail distribution function $f(x) = \mathbb{P}(H > x)$ is differentiable.
If the plan $\mathbf{x}$ is optimal, then the terms $\{x_k\}$ satisfy the variational recursion:

$$f(x_{n-1}) + x_{n+1} f'(x_n) = 0. \qquad (3)$$

Proof. This is immediate once one notices that the cost depends on $x_k$ via only two terms,
$f(x_{k-1}) x_k$ and $f(x_k) x_{k+1}$. □

This allows us to find $x_{n+1}$ as a function of $x_{n-1}, x_n$,

$$x_{n+1} = -\frac{f(x_{n-1})}{f'(x_n)},$$

and to reconstruct the whole optimal plan from its first two points, $x_0 = 0$ and $x_1$.

In fact, it is useful to think of $\{x_k\}_{k=0,1,\ldots}$ as iterations of the mapping $R : \mathbb{R}_+^2 \to \mathbb{R}_+^2$
given by

$$R : (x, y) \mapsto (y, -f(x)/f'(y))$$

(which we will still be referring to as the variational recursion).
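As a minimal sketch of how a plan is rebuilt from its first turning point, the code below iterates x_{n+1} = -f(x_{n-1}) / f'(x_n) starting from x_0 = 0 and a chosen x_1. The function name `plan_from_recursion` and the exponential tail used in the example are assumptions of this illustration.

```python
import math

def plan_from_recursion(x1, tail, dtail, steps):
    """Iterate x_{n+1} = -f(x_{n-1}) / f'(x_n) from x_0 = 0 and x_1 = x1.

    `tail` is f and `dtail` is its derivative f'. Returns [x_0, x_1, ..., x_{steps+1}].
    """
    xs = [0.0, x1]
    for _ in range(steps):
        x_prev, x_cur = xs[-2], xs[-1]
        xs.append(-tail(x_prev) / dtail(x_cur))
    return xs

# Example with the exponential tail f(x) = exp(-x), f'(x) = -exp(-x).
xs = plan_from_recursion(0.8, lambda x: math.exp(-x), lambda x: -math.exp(-x), 2)
```

For the exponential tail the recursion reduces to x_{n+1} = exp(x_n - x_{n-1}), so the example produces x_2 = exp(0.8) and x_3 = exp(x_2 - 0.8).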
3. Existence of an optimal sequence

For the two-sided (Beck-Bellman) search problem, the existence of the optimal search
plans was shown in [11], [6] and some improvements appeared in the subsequent papers. For
completeness, we supply the existence proof for the one-sided case, as we consider in detail
the associated nonlinear map.

Recall the cost functional

$$E(\mathbf{x}) = \sum_{k=0}^{\infty} x_{k+1} f(x_k)$$

and formulate the minimization problem:

$$E_0 = \inf \{E(\mathbf{x}),\ \mathbf{x} = (\ldots, x_{-2}, x_{-1}, x_0, x_1, x_2, \ldots, x_k, \ldots),\ x_j > 0,\ j \in \mathbb{N},\ x_k \to \infty\}. \qquad (4)$$

Note that we do not require the sequence to have a first term; that it has one is something we will prove. On
the other hand, if $f(x)$ does not vanish for any $x > 0$, there can be no other accumulation points
for an optimal plan, for otherwise the cost would be infinite.

Clearly, $E_0 \geq 0$, since $E \geq 0$. By definition of the infimum, there exists a minimizing
sequence $\{\mathbf{x}^{(n)}\}$ such that

$$E(\mathbf{x}^{(n)}) \to E_0.$$

The goal is to show that there is a convergent subsequence such that $\mathbf{x}^{(n_k)} \to \mathbf{x}^*$ and
$E(\mathbf{x}^{(n_k)}) \to E_0$.

Proposition 4 (Properties of minimizing sequences). Assume f is Lipschitz and $f(x) \neq 0$
for any $x > 0$. In the minimization problem (4), there exist two positive monotone
sequences, $\{a_k\}_{k=0}^{\infty}$ and $\{b_k\}_{k=0}^{\infty}$, such that $a_k < b_k$, $a_k \to \infty$, $b_k \to \infty$, and there is a minimizing
subsequence $\{\mathbf{x}^{(n)}\}$ such that $a_k < x_k^{(n)} < b_k$.

Proof. First, we note that $E_0 \leq 4L$ is a bounded quantity, see the previous section. To prove
existence of $\{b_k\}$, we first observe that any minimizing sequence must satisfy $E(\mathbf{x}^{(n)}) < 2E_0$
for sufficiently large n. Thus, $x_1 < 2E_0 = b_1$ and then $x_2 f(x_1) < 2E_0$. Therefore,

$$x_2 < \frac{2E_0}{f(x_1)} \leq \frac{2E_0}{f(2E_0)}.$$

We then define $b_2 = 2E_0 / f(2E_0)$. Proceeding by induction,

$$b_{k+1} = 2E_0 / f(b_k(E_0)), \qquad (5)$$

we obtain the desired sequence. Note that the sequence is strictly monotone as

$$x f(x) \leq L \leq E_0 < 2E_0,$$

and therefore the mapping (5) cannot have a fixed point $x = 2E_0 / f(x)$.

Thus, the sequence $\{b_k\}$ monotonically grows to infinity and it bounds the corresponding
terms of the minimizing sequence.

To establish the lower bounding sequence we prove the following lemma.^2

^2 We use the notation $C_L$ for the Lipschitz constant.
Lemma 1. Assume f is Lipschitz and let $\mathbf{x}$ be a monotone, possibly bi-infinite, sequence
of turning points. Assume $x_m < 1/(2C_L)$; then the modified sequence $\tilde{\mathbf{x}}$, with all $x_j$ ($j < m$)
removed, will have lower cost.

Proof. Rewrite

$$E(\mathbf{x}) = \sum_k x_{k+1} f(x_k) = \cdots + x_{m-1} f(x_{m-2}) + x_m f(x_{m-1}) + x_{m+1} f(x_m) + \cdots$$

and, for the modified sequence,

$$E(\tilde{\mathbf{x}}) = x_m + x_{m+1} f(x_m) + \cdots.$$

We need to show

$$x_m < \cdots + x_{m-1} f(x_{m-2}) + x_m f(x_{m-1}).$$

Rearranging some terms, we see that it suffices to have

$$\frac{1 - f(x_{m-1})}{x_{m-1}} < \frac{f(x_{m-2})}{x_m}.$$

The left-hand side is bounded by the Lipschitz constant $C_L$ and the right-hand side is bounded
from below by $1/x_m - C_L$. Therefore, by choosing $x_m < 1/(2C_L)$, we obtain the desired
result. □
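A small numerical check of Lemma 1, under the assumption of the exponential tail f(x) = exp(-x), whose Lipschitz constant is C_L = 1 so that the threshold is 1/(2 C_L) = 0.5. The plans and the helper `cost` below are illustrative choices, not taken from the paper.

```python
import math

def cost(plan, tail):
    """Formula (2) on a finite plan: sum of x_k * f(x_{k-1}), with x_0 = 0."""
    prev, total = 0.0, 0.0
    for x in plan:
        total += x * tail(prev)
        prev = x
    return total

tail = lambda x: math.exp(-x)                 # assumed tail, Lipschitz with C_L = 1
full = [0.1, 0.3, 1.0, 2.0, 4.0, 8.0, 16.0]   # x_m = 0.3 lies below 1/(2 C_L) = 0.5
trimmed = full[1:]                            # drop every turning point before x_m
saving = cost(full, tail) - cost(trimmed, tail)
```

Here the saving equals 0.1 + 0.3 * exp(-0.1) - 0.3, which is positive, so removing the very short first excursion lowers the cost as the lemma predicts.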

Therefore, an optimal sequence of turning points is one-sided and there is at most one
point in the interval $[0, \delta]$, where $\delta = 1/(2C_L)$. Then, we let $a_0 = 0$ and $a_1 = \delta$.

Now, the sequence $\{a_k\}$ can be constructed using monotonicity, $a_{k+1} > a_k$, and the fact that there
are finitely many terms on any interval of, say, unit size: $[\delta, \delta + 1]$, $[\delta + 1, \delta + 2]$, etc.

Monotonicity has been proved in the previous section by showing that in a nonmonotone
sequence, by deleting the appropriate terms, we obtain a strictly monotone sequence with
smaller cost.

□

Theorem 1. There exists a converging subsequence, $\mathbf{x}^{(n)} \to \mathbf{x}^{(*)}$, where $\mathbf{x}^{(*)}$ is strictly
monotone and $x_k^{(*)} \to \infty$. The cost function converges: $E(\mathbf{x}^{(n)}) \to E(\mathbf{x}^{*}) = E_0$.

Proof. Fix $N > 0$ and let $P_N \mathbf{x} = (x_1, x_2, \ldots, x_N)$. For the minimizing sequence $\mathbf{x}^{(n)}$, let
$\mathbf{x}^{(n_1)}$ be a subsequence for which $x_1^{(n_1)} \to x_1^*$. Take a subsequence of this subsequence, so
that $P_2 \mathbf{x}^{(n_2)} \to \{x_1^*, x_2^*\}$. Proceeding further and using the diagonal subsequence $\mathbf{x}^{(n_k),k}$, we
obtain a convergent subsequence, which we will still denote by $\mathbf{x}^{(n)} \to \mathbf{x}^*$. The limit $\mathbf{x}^*$
is a monotone sequence by construction. It must also be strictly monotone, for if not,
i.e. if some terms are equal, we already know from the previous section that by removing
repeated terms the cost is decreased, which contradicts the sequence being minimizing.

Now, to prove the second part of the theorem, let $E_N(\mathbf{x})$ denote the N-th partial sum. Fix
$N > 0$ to be sufficiently large, and observe that $E_N(\mathbf{x}^{(n)}) \to E_N(\mathbf{x}^*)$ just by continuity.
Because of the lower bounding sequence $\{a_k\}$, we can take N so large that $x_N^{(n)}$ and $x_N^{*}$
are larger than any fixed number. Consider now the remainders

$$E(\mathbf{x}^{(n)}) - E_N(\mathbf{x}^{(n)}), \qquad E(\mathbf{x}^*) - E_N(\mathbf{x}^*),$$

which are arbitrarily small. Indeed,

$$E(\mathbf{x}) - E_N(\mathbf{x}) = x_{N+1} f(x_N) + x_{N+2} f(x_{N+1}) + \cdots$$

and since the sequence is minimizing we can estimate the remainder by choosing, e.g.,
$x_{N+k} = 2^{k+1} x_{N-1}$. Next, using an argument similar to the one used in Proposition 2,
we obtain the bound

$$E(\mathbf{x}) - E_N(\mathbf{x}) < 4 \int_{x_{N-1}}^{\infty} f(x)\,dx.$$

The same bound holds for the other remainder. Thus, taking $x_{N-1}$ large enough we can
ensure that the remainders are arbitrarily small. This implies the convergence

$$E(\mathbf{x}^*) = E_0. \quad □$$

Next we demonstrate that the Lipschitz condition is necessary. Indeed, without it we can
construct an example with no initial turning point:

Example with singularity. If the tail distribution function is not Lipschitz, then the
sequence may fail to have a first turning point. Here, we present a simple example of
one-sided search.

Let $f(x) = 1 - \sqrt{x}$ and assume the search is done on the unit interval $[0, 1]$. It is also
possible to modify this example to the infinite ray $(0, \infty)$ by changing $f(x)$ outside of any
neighborhood of 0 so it does not vanish anywhere.

Suppose the optimal sequence is given by a one-sided sequence $\{0 < x_1 < x_2 < x_3 < \ldots\}$
with the cost

$$E(\mathbf{x}) = x_1 + x_2 (1 - \sqrt{x_1}) + x_3 (1 - \sqrt{x_2}) + \cdots.$$

Let us insert another point $x_0$ with $0 < x_0 < x_1$; then the cost of the modified sequence is given
by

$$E(\tilde{\mathbf{x}}) = x_0 + x_1 (1 - \sqrt{x_0}) + x_2 (1 - \sqrt{x_1}) + \cdots.$$

Comparing them, we find that the cost of the modified sequence is lower if and only if

$$x_0 + x_1 (1 - \sqrt{x_0}) < x_1 \iff \sqrt{x_0} < x_1 \iff x_0 < x_1^2.$$

The latter inequality can always be achieved. Therefore, the optimal sequence does not
have an initial turning point.
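The comparison in this example can be reproduced directly. The sketch below uses f(x) = 1 - sqrt(x) on [0, 1] and an arbitrary illustrative plan; the helper `cost` and the specific numbers are assumptions of the illustration.

```python
import math

def cost(plan, tail):
    """Formula (2) on a finite plan: sum of x_k * f(x_{k-1}), with x_0 = 0."""
    prev, total = 0.0, 0.0
    for x in plan:
        total += x * tail(prev)
        prev = x
    return total

tail = lambda x: 1.0 - math.sqrt(x)     # non-Lipschitz tail from the example
plan = [0.25, 0.5, 1.0]                 # illustrative plan with first point x_1 = 0.25
good = [0.5 * plan[0] ** 2] + plan      # extra point below x_1**2: should lower the cost
bad = [0.1] + plan                      # extra point above x_1**2 = 0.0625: should not
```

Because any first turning point can be undercut in this way, repeating the insertion never terminates, which is exactly the absence of a first turning point.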



4. Pareto distribution 

In this section we present an explicit example which illustrates our general approach: 
the optimal plan of the search problem belongs to an invariant manifold (separatrix) of the 
associated Hamiltonian map. 
4.1. Cost functional. Consider a Pareto type tail distribution (analogous to that of [14])

$$f(x) = x^{-\alpha} \quad \text{if } x \geq 1,$$

$$f(x) = 1 \quad \text{if } 0 \leq x < 1,$$

where we assume that $\alpha > 1$ in order to have a bounded expected value.

We will use, exceptionally, the notation $x_0 = 1$, which makes the formulas look simpler.
Note that $x_0 = 1$ does not correspond to an actual turning point. The expected cost is
given by

$$E(\mathbf{x}) = x_1 + f(x_1) x_2 + f(x_2) x_3 + \cdots = \sum_{n=0}^{\infty} \frac{x_{n+1}}{x_n^{\alpha}}.$$

The variational recursion reads in this case

$$x_{k+1} = \frac{x_k^{\alpha+1}}{\alpha\, x_{k-1}^{\alpha}},$$

or equivalently

$$\frac{x_{k+1}}{x_k^{\alpha}} = \frac{1}{\alpha} \frac{x_k}{x_{k-1}^{\alpha}} = \cdots = \frac{1}{\alpha^k} x_1.$$

Therefore, for the sequences generated by the variational recursion, with $x_1 = x$, we can
immediately compute the cost

$$E(x) = \sum_{n=0}^{\infty} \alpha^{-n} x_1 = x_1 \frac{\alpha}{\alpha - 1},$$

as a function of the initial condition $x_1 = x$.

This expression indicates that $x_1$ should be as small as possible, provided the sequence
satisfies the constraints of monotonicity and unbounded growth.

From the sequence definition, we have

$$\frac{x_{k+1}}{x_k} = \frac{1}{\alpha} \left(\frac{x_k}{x_{k-1}}\right)^{\alpha},$$

or, denoting the ratios by $r_k = x_k / x_{k-1}$,

$$r_{k+1} = \alpha^{-1} r_k^{\alpha}, \qquad r_1 = x_1.$$

Defining $w_k = r_k\, \alpha^{-\frac{1}{\alpha-1}}$ gives

$$w_{k+1} = w_k^{\alpha}.$$

We clearly need to take $w_1 \geq 1$, so that the ratios would not go to zero and the sequence
$x_k$ would be monotone. However, since we need $x_1$ to be as small as possible, we take
$w_1 = 1$, resulting in $x_1 = r_1 = \alpha^{\frac{1}{\alpha-1}}$. Therefore, the minimal cost is given by

$$E_0 = \frac{\alpha}{\alpha - 1}\, \alpha^{\frac{1}{\alpha-1}} = \frac{\alpha^{\frac{\alpha}{\alpha-1}}}{\alpha - 1}, \qquad (6)$$

and the optimal sequence is given by

$$x_k = \alpha^{\frac{k}{\alpha-1}}.$$

In the particular case of $\alpha = 2$, the optimal sequence is given by the geometric sequence $x_k = 2^k$.
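The closed-form answer for the Pareto case can be cross-checked by iterating the recursion x_{k+1} = x_k^(alpha+1) / (alpha * x_{k-1}^alpha) with the convention x_0 = 1. The sketch below does this for alpha = 2; the function names `pareto_plan` and `pareto_cost` are hypothetical.

```python
def pareto_plan(alpha, x1, steps):
    """Iterate the Pareto variational recursion from x_0 = 1 (convention) and x_1 = x1."""
    xs = [1.0, x1]
    for _ in range(steps):
        xs.append(xs[-1] ** (alpha + 1) / (alpha * xs[-2] ** alpha))
    return xs

def pareto_cost(xs, alpha):
    """Partial sum of the cost, sum over n of x_{n+1} / x_n**alpha."""
    return sum(xs[n + 1] / xs[n] ** alpha for n in range(len(xs) - 1))

alpha = 2.0
x1_opt = alpha ** (1.0 / (alpha - 1.0))   # candidate optimal first point, here 2
xs = pareto_plan(alpha, x1_opt, 30)
partial = pareto_cost(xs, alpha)
```

With alpha = 2 the iterates are the powers of two and the partial sums approach x_1 * alpha / (alpha - 1) = 4.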



4.2. Hamiltonian dynamics. The global structure of the dynamics defined by the variational
recurrence in this homogeneous problem is shown in Figure 3. Here we draw
the invariant curves for the trajectories defined by R: the iterations of a point $(x_k, x_{k+1})$
found on one of these curves stay on it forever. The red (thick) line corresponds to the
optimal trajectory.

Figure 3. Phase portrait for the variational recursion for the homogeneous
distribution (vertical axis: $x_{k+1}$). There are two regions: above the line $x_{k+1} = \alpha^{1/(\alpha-1)} x_k$,
where all the orbits monotonically grow, and below, where all the orbits lose
monotonicity eventually.

The qualitative dynamics in this case can be summarized as follows: 

• There is a region of initial values $x_1$ where the variational recursion stops making
sense: the iterates become non-monotone. We will call this region chaotic.^3

• The optimal initial value is on the boundary of the chaotic set.

• The growth of the optimal plan (exponential) is far slower than the growth for
generic initial values outside the chaotic region (where it is super-exponential).

^3 Although the dynamics is not really chaotic in this particular case, we will see that this is rather an
exception.
The sequences can be represented as solutions of the two-dimensional nonlinear map

$$x_{k+1} = r_{k+1} x_k,$$

$$r_{k+1} = \alpha^{-1} r_k^{\alpha}.$$

The ray $r = r_* = \alpha^{\frac{1}{\alpha-1}}$ is invariant. Above this ray, the orbits go rapidly to
infinity. The orbits below $r = r_*$ are not monotone, because $r_k$ monotonically decreases to
zero: while $x_k$ may grow at first, once $r_k$ becomes less than 1, $x_k$ will be decreasing.
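The trichotomy around the invariant ray can be seen by iterating the ratio map r_{k+1} = r_k^alpha / alpha alone. This sketch takes alpha = 2, so that r_* = 2; the name `ratio_orbit` and the 5% perturbations are illustrative assumptions.

```python
def ratio_orbit(alpha, r1, steps):
    """Iterate r_{k+1} = r_k**alpha / alpha starting from r_1 = r1."""
    rs = [r1]
    for _ in range(steps):
        rs.append(rs[-1] ** alpha / alpha)
    return rs

alpha = 2.0
r_star = alpha ** (1.0 / (alpha - 1.0))          # invariant ratio, equal to 2 here
on_ray = ratio_orbit(alpha, r_star, 8)           # stays at r_star
below = ratio_orbit(alpha, 0.95 * r_star, 8)     # decays, eventually below 1
above = ratio_orbit(alpha, 1.05 * r_star, 8)     # blows up super-exponentially
```

Once the ratio in the `below` orbit drops under 1, the corresponding x_k start to decrease, which is the loss of monotonicity described above.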



5. Exponential tail distribution 

In this section we analyze in detail the prototypical case of the exponential distribution.
While this case is sufficiently simple to allow complete understanding, the Hamiltonian
dynamics is no longer integrable. Therefore, the methods that we develop would apply to
other cases of interest.



5.1. Variational recursion. We consider now several key properties of the variational
recursion $R : (x, y) \mapsto (y, -f(x)/f'(y))$.

One of the basic observations is that it preserves an area form:

Proposition 5. The mapping R preserves the area form $\omega = f'(x)\, dx \wedge dy$.

This is a rather general fact: for any recursion obtained by extremization of the functional

$$E(\mathbf{x}) = \sum_{k=0}^{\infty} F(x_k, x_{k+1}),$$

the 2-form $\frac{\partial^2 F}{\partial x\, \partial y}\, dx \wedge dy$ is invariant with respect to the associated two-dimensional mapping.

It is possible to explicitly give the coordinates in which the variational recursion R is
Hamiltonian: if we use $(s, y)$, where $s = f(x)$, in lieu of $(x, y)$, then

$$R : (s, y) \mapsto (f(y), -s/f'(y));$$

it maps $[0, 1] \times \mathbb{R}_+$ into itself and preserves the Lebesgue area $ds \wedge dy$. We will be referring
to this coordinate system as standard.

In the standard coordinates, the variational recursion for the exponentially distributed
H (i.e. for $f(x) = \exp(-x)$) is given by

$$R : (s, y) \mapsto (e^{-y}, s e^{y}).$$

Further, one can see that R has a unique stationary point, $s = e^{-1}$, $y = 1$. One can
verify that this fixed point is elliptic.
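The claims about the fixed point can be verified by hand or with a few lines of code. The sketch below writes out the Jacobian of (s, y) -> (exp(-y), s * exp(y)) explicitly and computes its eigenvalues at (1/e, 1); the function names are hypothetical.

```python
import cmath
import math

def R(s, y):
    """Variational recursion in standard coordinates for the exponential tail."""
    return math.exp(-y), s * math.exp(y)

def jacobian(s, y):
    """Jacobian matrix of R at (s, y), as ((dR1/ds, dR1/dy), (dR2/ds, dR2/dy))."""
    return ((0.0, -math.exp(-y)), (math.exp(y), s * math.exp(y)))

s_star, y_star = math.exp(-1.0), 1.0
(a, b), (c, d) = jacobian(s_star, y_star)
trace, det = a + d, a * d - b * c
disc = cmath.sqrt(trace * trace - 4.0 * det)
eigs = ((trace + disc) / 2.0, (trace - disc) / 2.0)
```

The determinant equals 1 (area preservation), the trace equals 1, and the eigenvalues form a complex-conjugate pair on the unit circle, which is the elliptic case.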
Figure 4. Several orbits of the variational recursion for the exponential distribution.
The solid curve separates the chaotic region from the monotonicity
region. The region of interest is located to the left of the vertical line x = 1.
The monotone orbits outside of the chaotic region are not present as they
are rapidly mapped to infinity.

5.2. Cost functional and cost function. We already know that the optimal plan can be
found only among the trajectories satisfying the variational recursion. We will set $x_0 = 0$;
under this assumption the trajectories (not necessarily increasing) satisfying the variational
recursion are parameterized by the first non-zero term $x_1 := x$. We will be denoting the
corresponding family of trajectories as $\mathbf{x}_R(x) = \{x_0 = 0, x_1 = x, x_2 = x_2(x), \ldots\}$. For
the exponentially distributed H, the first few terms of the family $\mathbf{x}_R(x)$ are given by
$x_1 = x$; $x_2 = e^{x}$; $x_3 = e^{e^{x} - x}$, and so on.

Notation: We will use the term cost functional for (2), defined on the space of all trajectories
$\mathbf{x}$, while reserving the term cost function for the restriction of the functional E
to the one-parameter curve $\mathbf{x}_R(x)$ of solutions to the variational recursion, denoting the cost
function by $E(x) := E(\mathbf{x}_R(x))$.

For exponentially distributed H, the cost function is finite on monotonic trajectories.
Indeed, in this case, unless growing without bound, the trajectory should converge to the
only fixed point of the variational recursion, which is impossible as it is an elliptic point.
If for some K, $x_K > 1$, then for $k > K$,

$$x_{k+1} - x_k = \ln x_{k+2} > \ln x_K > 0,$$

and $x_k$ grows at least as an arithmetic progression, implying the convergence of

$$E(x) = \sum_{k=0}^{\infty} x_{k+1} \exp(-x_k) = \sum_{k=0}^{\infty} \exp(-x_{k-1}).$$


Now, as the cost function $E(x)$ is a function of one variable, and we established that
the optimal trajectory should be one of the family $\mathbf{x}_R(x)$, it might appear that the rest is
straightforward: to find the minimum of $E(x)$ over the starting point $x_1 = x$. However, if
we take the formal derivative

$$\frac{dE}{dx} = \sum_{k=0}^{\infty} \frac{d}{dx}\bigl(x_{k+1}(x) f(x_k(x))\bigr),$$

we will see that all the terms vanish identically (precisely because $\mathbf{x}_R(x) = \{x_1(x), x_2(x), \ldots\}$
satisfies the variational recursion). It might appear that $E(x)$ should be a constant! However,
we already computed $E(x)$ in an example in Section 4 and know that this is not the
case.

The reason for this calamity is, of course, the fallacious differentiation of an infinite sum
of differentiable functions with wildly growing $C^1$ norms.
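However, if we consider the approximants

$$E_K(x) = \sum_{k=0}^{K} x_{k+1} f(x_k),$$

they can be differentiated term by term, yielding

$$\frac{dE_K}{dx}(x) = x'_{K+1}(x)\, f(x_K(x)) \qquad (7)$$

(by telescoping: the recursion (3) turns each term $x_{k+1} f'(x_k) x'_k$ into $-f(x_{k-1}) x'_k$, which cancels against the preceding term).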
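The telescoping identity for the truncated cost, dE_K/dx = x'_{K+1}(x) * f(x_K(x)), can be checked with finite differences. The sketch below assumes the exponential tail, for which the recursion is x_{k+1} = exp(x_k - x_{k-1}); the helper names, the step size and the sample point x = 0.8 are choices made for this illustration.

```python
import math

def orbit(x, K):
    """x_0 = 0, x_1 = x, x_{k+1} = exp(x_k - x_{k-1}); returns [x_0, ..., x_{K+1}]."""
    xs = [0.0, x]
    for _ in range(K):
        xs.append(math.exp(xs[-1] - xs[-2]))
    return xs

def E_K(x, K):
    """Truncated cost: sum over k = 0..K of x_{k+1} * exp(-x_k)."""
    xs = orbit(x, K)
    return sum(xs[k + 1] * math.exp(-xs[k]) for k in range(K + 1))

def central_diff(g, x, h=1e-6):
    """Symmetric finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2.0 * h)

K, x = 3, 0.8
lhs = central_diff(lambda t: E_K(t, K), x)               # dE_K/dx
dx_last = central_diff(lambda t: orbit(t, K)[K + 1], x)  # derivative of x_{K+1} in x
rhs = dx_last * math.exp(-orbit(x, K)[K])                # x'_{K+1}(x) * f(x_K(x))
```

The two sides agree up to finite-difference error, even though each individual x_k(x) has a rapidly growing derivative.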

As $E_K(x)$ approximates $E(x)$ to within $4E_0 f(x_K)$, which uniformly converges to zero,
the existence of a local minimum of $E(x)$ in an interval where E is finite would imply that
the approximants $E_K(x)$ have local minima in that interval, for all large enough K. Later
we will use this observation to prove that the reduced cost function has its optimal solution
on the separatrix.

6. Hamiltonian dynamics 

Denote by $\mathcal{P} = \{1 > s > 0,\ y > 0\}$ the phase space (in standard coordinates) on which
the variational recursion acts.

6.1. Chaotic and monotone regions. 

Definition 1. The region $\mathcal{M}_k$ of k-step monotonicity is defined as the collection of points
in $\mathcal{P}$ such that the k-fold application of R produces a monotonic (along the y coordinate)
sequence. The intersection of all $\mathcal{M}_k$'s is denoted by $\mathcal{M}_\infty := \cap_k \mathcal{M}_k$ and is called the region
of monotonicity. Its complement is called the chaotic region.
Figure 5. Invariant curve and iterated initial data in the exponential case
in (y, z) coordinates. The long curve is the separatrix. It corresponds to
the solid curve in Figure 4. The line segment with the end points (0, 0)
and (1, 1) represents a one-parameter family of the initial turning points
$x_1$. Note that the segment intersects the separatrix at exactly two points.
These two points are the candidates for the optimal search sequence. The
other curves are obtained by iterating the initial segment by the forward
map.

The boundary S of the monotonicity region is called the separatrix. It is not immediate
that the separatrix is a curve: the monotone and chaotic regions can have rather wild structure.
However, we will see that the separatrix is indeed a smooth curve, and the relevant
part of it can be represented as a graph of a function in some appropriate coordinates.

6.2. Existence of separatrix: exponential distribution. The existence of the separatrix
in the phase space for the exponentially distributed H is proved by applying the
standard Banach contraction mapping principle.

We start by introducing more convenient coordinates in the phase space: $(x, y) \mapsto (y, z = y - x)$.
Thus, $z_{k+1} = x_{k+1} - x_k$ "measures" monotonicity of the orbits.

In these new coordinates, the mapping R is given by

$$R : (y, z) \mapsto (e^{z},\ e^{z} - y). \qquad (8)$$

The inverse map in these coordinates acts as

$$R^{-1} : (Y, Z) \mapsto (Y - Z,\ \ln Y). \qquad (9)$$

Recall that $(x, y)$ represent the successive points of the trajectory $(x_k, x_{k+1})$.
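For the exponential tail the recursion reads x_{k+2} = exp(x_{k+1} - x_k), so in the coordinates y = x_{k+1}, z = x_{k+1} - x_k the forward map should send (y, z) to (exp(z), exp(z) - y), with inverse (Y, Z) -> (Y - Z, ln Y). The sketch below encodes these two formulas, which are derived here from the recursion rather than quoted, and checks that they are mutually inverse; the function names are hypothetical.

```python
import math

def R_yz(y, z):
    """Forward map (x_{k+1}, x_{k+1} - x_k) -> (x_{k+2}, x_{k+2} - x_{k+1})."""
    y_next = math.exp(z)          # x_{k+2} = exp(x_{k+1} - x_k)
    return y_next, y_next - y

def R_yz_inv(Y, Z):
    """Inverse map: recover (x_{k+1}, x_{k+1} - x_k) from the next pair."""
    return Y - Z, math.log(Y)

# Start from x_0 = 0, x_1 = 0.8, i.e. (y, z) = (0.8, 0.8).
y0, z0 = 0.8, 0.8
y1, z1 = R_yz(y0, z0)
back = R_yz_inv(y1, z1)
```

A point with z > 0 corresponds to a step on which the plan is still increasing, which is why z is the natural coordinate for tracking monotonicity.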
The iterations of the boundary of the monotonicity region $\{Z > 0\}$ result in curves $z = \phi_k(y)$,
where the functions $\phi_k$ satisfy the recursion

$$\phi_{k+1}(Y - \phi_k(Y)) = \ln(Y),$$

or, equivalently,

$$\phi_{k+1}(y) = \ln(\psi_k(y)),$$

where $\psi_k$ is defined as the inverse to $Y \mapsto Y - \phi_k(Y)$.

Proposition 6. The map $\phi_k \mapsto \phi_{k+1}$ defined above is a contraction in the space of continuously
differentiable positive functions with bounded derivative $0 < \phi'(y) < 1/2$ for $y > 4$.
There is a continuous limit $\phi = \lim_{k \to \infty} \phi_k$, which solves the functional equation

$$\phi(y - \phi(y)) = \ln(y)$$

and satisfies the bound $|\phi(y) - \ln(y)| < 1$ on $y \in [4, \infty)$.

By construction, the region below the separatrix S (in (y, z) coordinates) corresponds
to the non-monotonic solutions of the variational recursion, and that above S corresponds
to monotonically increasing solutions. In other words, S is indeed the boundary of $\mathcal{M}_\infty$.

Proof. Consider the inverse map (9). It takes a graph $(y, \phi(y))$ into a graph $(y, \Phi(\phi)(y))$,
where

$$\Phi(\phi)(y) = \ln(w_\phi(y)),$$

where $w_\phi(y)$ solves the equation

$$y = w_\phi(y) - \phi(w_\phi(y)).$$

We consider this mapping in the space of continuously differentiable functions

$$X = \{\phi \in C^1[y_0, \infty),\ \phi(y) > 0,\ 0 < \phi'(y) < 1/2\}.$$

Note that at each iteration we have a well-defined function $w = w_\phi(y)$ and that $w_\phi(y) > y$.
Indeed, by the implicit function theorem, we need $\phi'(w) \neq 1$, which we have since $\phi'(y) < 1/2$
and $w_\phi(y) > y$.

First, we show that we can iterate indefinitely:

$$\Phi(\phi)(y) = \ln(w_\phi(y)) > \ln(y) > \ln(y_0) > 0,$$

if $y_0 > 1$. Differentiating,

$$\frac{d}{dy}\Phi(\phi)(y) = \frac{w'_\phi(y)}{w_\phi(y)} = \frac{1}{w_\phi(y)} \cdot \frac{1}{1 - \phi'(w_\phi(y))} < \frac{2}{w_\phi(y)} < \frac{2}{y} < \frac{2}{y_0} < \frac{1}{2}, \qquad (10)$$

if $y_0 > 4$. Also, since $w'_\phi(y) > 0$, we have

$$\frac{d}{dy}\Phi(\phi)(y) > 0.$$


Now, we show that the mapping is a contraction in the space of continuous functions.
Let $y > y_0$ and consider

$$|\Phi(\phi)(y) - \Phi(\psi)(y)| = |\ln(w_\phi(y)) - \ln(w_\psi(y))| \qquad (11)$$

$$\leq \frac{1}{\min(w_\phi(y), w_\psi(y))} \cdot |w_\phi(y) - w_\psi(y)| < \frac{1}{y} |w_\phi(y) - w_\psi(y)|.$$

Now, observe that

$$|w_\phi(y) - w_\psi(y)| = |\phi(w_\phi(y)) - \psi(w_\psi(y))| \leq |\phi(w_\phi(y)) - \phi(w_\psi(y))| + |\phi(w_\psi(y)) - \psi(w_\psi(y))|$$

$$\leq \sup_{y > y_0} |\phi'| \cdot |w_\phi(y) - w_\psi(y)| + \sup_{y > y_0} |\phi(y) - \psi(y)|.$$

Therefore,

$$|w_\phi(y) - w_\psi(y)| \leq \frac{\sup_{y > y_0} |\phi(y) - \psi(y)|}{1 - \sup_{y > y_0} |\phi'|} \leq 2 \sup_{y > y_0} |\phi(y) - \psi(y)|,$$

and combining this inequality with (11), we obtain the contraction

$$\sup_{y > y_0} |\Phi(\phi)(y) - \Phi(\psi)(y)| \leq \frac{2}{y_0} \sup_{y > y_0} |\phi(y) - \psi(y)| < \frac{1}{2} \sup_{y > y_0} |\phi(y) - \psi(y)|,$$

assuming again that $y_0 > 4$.

As usual in the contraction argument, the distance between the initial guess $\phi_0(y) = \ln(y)$
and the limit $\phi(y)$ is bounded by $\|\phi - \phi_0\| \leq 2\|\phi_1 - \phi_0\|$. Consider

$$|\phi_1(y) - \phi_0(y)| = |\ln(t(y)) - \ln(y)|,$$

where $y = t(y) - \ln(t(y))$ with $y > 4$. Thus,

$$|\ln(t(y)) - \ln(y)| \leq \frac{1}{y} |t(y) - y| \leq \frac{|t'(y) - 1| \cdot y}{y} = |t'(y) - 1| = \frac{1}{|t(y) - 1|},$$

where we used the derivative of the inverse function. Since we assume that $y > 4$, which
implies $t(y) > 2$, we have

$$|\phi(y) - \phi_0(y)| \leq 2 |\phi_1(y) - \phi_0(y)| < 1.$$

□
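The contraction argument doubles as a numerical scheme. The sketch below computes the iterates phi_k on a point-by-point basis, starting from phi_0 = ln and solving w - phi_{k-1}(w) = y by a plain fixed-point iteration (which converges because 0 < phi' < 1/2). The function name `phi`, the number of inner iterations and the sample points are assumptions of this illustration, and the recursion is deliberately naive rather than efficient.

```python
import math

def phi(k, y, inner=15):
    """k-th iterate phi_k(y) of the graph transform, starting from phi_0 = ln."""
    if k == 0:
        return math.log(y)
    w = y
    for _ in range(inner):
        # Fixed-point iteration for w - phi_{k-1}(w) = y.
        w = y + phi(k - 1, w, inner)
    return math.log(w)

step_first = abs(phi(1, 10.0) - phi(0, 10.0))   # size of the first correction at y = 10
step_later = abs(phi(4, 10.0) - phi(3, 10.0))   # size of a later correction at y = 10
residual = abs(phi(4, 10.0 - phi(3, 10.0)) - math.log(10.0))
```

Successive corrections shrink quickly, the iterates stay within distance 1 of ln(y) on the sampled points, and the recursion phi_{k+1}(Y - phi_k(Y)) = ln(Y) is reproduced up to the accuracy of the inner solve.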

Now, we verify that the obtained separatrix is actually smooth. We need this property 
as we later prove that the cost function increases away from the separatrix. In fact, the 
separatrix is possibly an analytic function, see the appendix. 

Proposition 7. The separatrix is a continuously differentiable function on the interval
$[13, \infty)$ satisfying the bound

$$\frac{d}{dy}\phi(y) < \frac{2}{y}.$$

Proof. Now we consider contraction in the space of continuously differentiable functions
with the norm

$$\|\phi\|_1 := \sup_{y > y_0} |\phi| + \sup_{y > y_0} |\phi'|$$

and with the bound

$$|\phi''(y)| \leq 1.$$

We will also use the notation

$$\|\phi\|_0 := \sup_{y > y_0} |\phi|.$$

Using the definition of $\Phi(\phi)$ and of $w_\phi$, we calculate

$$\Phi''(\phi)(y) = \frac{w''_\phi}{w_\phi} - \frac{(w'_\phi)^2}{w_\phi^2}$$

and

$$w''_\phi = \frac{\phi''(w_\phi)\,(w'_\phi)^2}{1 - \phi'(w_\phi)}.$$

Recall that for $y_0 > 4$ we have $0 < \phi'(y) < 1/2$ and $1 < w'_\phi(y) < 2$, so that

$$|w''_\phi(y)| \leq 8 |\phi''(y)| \leq 8.$$

Next, we have

$$|\Phi''(\phi)(y)| \leq \frac{8}{y_0} + \frac{4}{y_0^2}.$$

Taking, e.g., $y_0 = 10$, we can ensure that the last expression is bounded by 1.
Now, we prove that we indeed have contraction 

\m<t>(y)) - <f(V(2/))l|i = sup M^y)) - Hi>(y))\ + sup |*'(0(y)) - &ty{y))\. 

y>yo y>yo 

We already know that 

sup |*(0(y)) - *(^(y))| < -|0 - Vlo < —|0 - V'li- 
Now, we estimate 

Wip ~ 



|$'(0(y))-$'(V>(y))| 



Using the estimates obtained in the proof of Proposition [6j we have 

l^lkvl ~ yo 

and 

\[ |w_\phi(y) - w_\psi(y)| \le 2 \sup_{y > y_0} |\phi(y) - \psi(y)|. \]

On the other hand, differentiating the identity

\[ y = w_\phi(y) - \phi(w_\phi(y)) \]

and using triangle inequalities, we can estimate the difference

\[ |w_\phi' - w_\psi'| \le |\phi'(w_\phi)| \cdot |w_\phi' - w_\psi'| + |w_\psi'| \left( |\phi'(w_\phi) - \psi'(w_\phi)| + |\psi'(w_\phi) - \psi'(w_\psi)| \right). \]

The first difference on the right-hand side can be absorbed into the left-hand side as we
did in the proof of Proposition 6. The second difference is estimated by

\[ |\phi'(w_\phi) - \psi'(w_\phi)| \le \|\phi - \psi\|_1 \]

and the third one by

\[ |\psi'(w_\phi) - \psi'(w_\psi)| \le \|\psi''\|_0 \cdot |w_\phi - w_\psi| \le |w_\phi - w_\psi|, \]

where $|w_\phi - w_\psi|$ has been estimated in Proposition 6.
Combining these inequalities, we obtain 

|*'(0(y))-*'ty(j/))|< ( 12 + 4 2 

By taking sufficiently large $y_0$, e.g. $y_0 = 13$, we obtain contraction in $C^1$. Having established
continuous differentiability of $\phi$, the bound follows from the a priori estimate (10).

□ 
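
The contraction argument can be illustrated numerically. The following Python sketch assumes that the graph transform is $(\Phi\phi)(y) = \ln w_\phi(y)$ with $w_\phi(y) - \phi(w_\phi(y)) = y$ (as read off from the identity used in the proof), iterates it on a finite grid starting from $\phi_0 = \ln y$, and extends $\phi$ beyond the grid by the two-term expansion $\ln w + \ln(w)/w$. The grid, the tail extension, the iteration counts and all function names are illustrative assumptions and are not part of the proof.

```python
import numpy as np


def graph_transform(y, phi, n_inner=60):
    """Apply Phi once: (Phi phi)(y) = ln w, where w solves w - phi(w) = y."""

    def phi_eval(w):
        # On the grid: linear interpolation of the current iterate.
        # Beyond the grid: two-term asymptotics (an assumption of this sketch).
        tail = np.log(w) + np.log(w) / w
        return np.where(w <= y[-1], np.interp(w, y, phi), tail)

    w = y.copy()
    for _ in range(n_inner):
        # Fixed-point iteration for w = y + phi(w); it contracts since |phi'| < 1/2.
        w = y + phi_eval(w)
    return np.log(w)


def separatrix(y0=13.0, y_max=200.0, n=4000, n_outer=40):
    """Iterate the graph transform starting from phi_0(y) = ln y."""
    y = np.linspace(y0, y_max, n)
    phi = np.log(y)
    for _ in range(n_outer):
        phi = graph_transform(y, phi)
    return y, phi
```

Calling `separatrix()` returns the grid and the approximate fixed point; finite differences of the result can then be compared with the bound of Proposition 7.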

Remark 1. By iterating the inverse map, one can show that the separatrix is smooth on
a larger interval $[1, \infty)$.

6.3. Properties of the separatrix. 

• By construction, the region below the separatrix $S$ (in $(y, z)$ coordinates) corresponds
to the non-monotonic solutions of the variational recursion, and that above
$S$ corresponds to monotonically increasing solutions. In other words, $S$ is indeed
the boundary of $M_\infty$.

• Using the functional equation, it is possible to obtain a logarithmic series expansion of
the function $\phi$ defining the separatrix near $y = \infty$ (the derivation can be found in
the appendix):

\[ \phi(y) = \ln(y) + \frac{\ln(y)}{y} + \dots \]

• In the standard coordinates, it is instructive to consider the separatrix as the stable 
invariant manifold of a topological saddle "at infinity". The intuition behind this 
picture underlies the construction of the separatrix. 

7. Cost function and optimal trajectories 

To understand the properties of the cost function and its approximations $E_N(x)$ we will
need a standard trick from hyperbolic dynamics. There it is used to find fragile objects
(invariant foliations) from robust ones (invariant cones), see e.g. [15].

7.1. Consistent cone fields. We will continue to work in $(y, z)$ coordinates.

We will refer to a pair of nowhere collinear vector fields $(\eta(y, z), \xi(y, z))$ (or, rather, to
the convex cone in the tangent spaces spanned by these vector fields) as the cone field
$K_{\eta,\xi}$, and to the vector fields $\eta, \xi$ as the generators of $K_{\eta,\xi}$. We will say that the cone
field $K_{\eta,\xi}$ is consistent at $(y, z)$ if the variational recursion $R$ maps it into itself, i.e.

\[ DR\,\bigl(K_{\eta,\xi}(y, z)\bigr) \subset K_{\eta,\xi}(R(y, z)); \]

here $DR$ is the differential of $R$. For exponential $H$, it is given in the coordinates $(y, z)$
by

\[ DR(y, z) = \begin{pmatrix} 0 & e^{z} \\ -1 & e^{z} \end{pmatrix}. \]

We will call a subset $A$ of the quadrant $\{y > 0,\ z > 0\}$ an $R$-stable set if it is mapped
into itself, i.e. $R(A) \subset A$.

Proposition 8. The subset of the quadrant $A = \{y > 0,\ z > \max(0, \phi(y))\}$ is an $R$-stable
set.

In other words, all the points in the positive quadrant and above the separatrix do
not leave that region under the action of $R$. This statement follows from the invariance of the
separatrix and from the fact that the ray $\{y = 0,\ z > 0\}$ and the segment $\{0 < y < y_*,\ z = 0\}$ are
mapped inside $A$, where $(y_*, 0)$ is the point where the separatrix intersects the $y$-axis.

Now we will construct an explicit consistent cone field for the exponential $H$. It is in
fact just the constant field, spanned by the tangent vectors $\eta = (1, 2)$ and $\xi = (2, 1)$.

A straightforward computation shows that in the region $\{z > \ln 4\}$ the cone field generated
by $\eta$ and $\xi$ is consistent, and we deduce

Proposition 9. In the region $z > \ln 4$ above the separatrix, which is an $R$-stable set, there
exists a consistent cone field transversal to the vertical vector field $(0, 1)$.
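
The "straightforward computation" can be spot-checked numerically. The sketch below assumes the map $R(y, z) = (e^z, e^z - y)$ for the exponential case, so that $DR$ depends on $z$ only, and tests whether the images of the two generators $\eta = (1, 2)$, $\xi = (2, 1)$ have positive coordinates in the basis $(\eta, \xi)$. The function names are hypothetical.

```python
import math


def dR(z):
    """Differential of R(y, z) = (e^z, e^z - y); it does not depend on y."""
    e = math.exp(z)
    return ((0.0, e), (-1.0, e))


def apply(m, v):
    """Matrix-vector product for a 2x2 matrix given as nested tuples."""
    return (m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1])


def cone_coordinates(v):
    """Coefficients (a, b) with v = a * eta + b * xi, eta = (1, 2), xi = (2, 1)."""
    a = (2.0 * v[1] - v[0]) / 3.0
    b = (2.0 * v[0] - v[1]) / 3.0
    return a, b


def cone_is_mapped_into_itself(z):
    """True if DR maps both generators into the open cone spanned by eta and xi."""
    for gen in ((1.0, 2.0), (2.0, 1.0)):
        a, b = cone_coordinates(apply(dR(z), gen))
        if a <= 0.0 or b <= 0.0:
            return False
    return True
```

Since the cone is convex and $DR$ is linear, checking the two generators suffices; the check succeeds exactly when $e^z > 4$, in agreement with the threshold $z > \ln 4$.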

7.2. Monotonicity of the cost function on intervals of regularity. Now we are ready
to prove the key fact about the cost function $E(x)$. Consider the ray $r := \{(t, t),\ 0 < t < \infty\}$
of initial conditions for the variational recursion. We will say that $t_*$ is a regular point if
some vicinity of $t_*$ in the ray $r$ belongs to the monotone region $M_\infty$. In other words, for
the initial data $x_0 = 0$, $x_1 = t$, where $t$ is close to $t_*$, the variational recursion generates an
increasing trajectory, for which the cost function is a well defined function $E(x)$.
It turns out that a regular point cannot be a local extremum of $E(x)$.

Proposition 10. In $(y, z)$ coordinates, if the region above the separatrix supports a consistent
cone field $K$, with $\eta$ being one of the generators, and $\eta$ is not $R$-invariant, then on any
interval $I \subset r$ in the intersection of the ray of initial data with the monotone
region $M_\infty$ the function $E(x)$ is monotone.

Proof. Consider partial sums $E_N(x)$ which approximate $E(x)$:

\[ E_N(x) = \sum_{n=0}^{N} f(x_n)\,x_{n+1}, \tag{12} \]

where the trajectory $x_n(x)$ solves the variational recursion. It is immediate that $E_N(x)$ is
a smooth function of $x$, if $f(x)$ is.

As $E_N(x)$ converge pointwise to $E(x)$, non-monotonicity of $E$ on $I$ would imply that for
some compact subinterval $J \subset I$, all the functions $E_N$ have a critical point on $J$ provided
$N$ is sufficiently large.

By the variational recursion,

\[ \frac{dE_N}{dx} = f(x_N)\,\frac{dx_{N+1}}{dx}, \]

and criticality $dE_N/dx = 0$ is possible only if $dx_{N+1}/dx = 0$ at some point $x_*$ of $J$.

As the $N$-th iteration of the initial point $(y, z) = (x_1, x_1 - x_0)$ is $(x_{N+1}, x_{N+1} - x_N)$, the
vanishing of $dx_{N+1}/dx$ means that in $(y, z)$ coordinates the $N$-th iteration by $DR$ of
the tangent vector to the ray $r$ is vertical.

However, the line of the initial conditions is the diagonal $(y = t,\ z = t)$. Computer
simulations, see Figure 5, show that after several iterates, the ray gets mapped into the
cone field (above $z = \ln 4$).

As $K$ is consistent above the separatrix, the iterations of these tangent vectors
under $DR$ will still be in the interior of $K$, while the vertical vector field is the generator
of $K$. Hence, $dx_{N+1}/dx$ cannot vanish on $J$, ensuring that $J$ cannot contain a local
extremum of $E$. □
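
The reduction of $dE_N/dx$ to a single term can be checked by finite differences in the exponential case. The sketch below assumes $f(x) = e^{-x}$, the recursion $x_{n+1} = \exp(x_n - x_{n-1})$ with $x_0 = 0$, $x_1 = t$, and the partial sums $E_N = \sum_{n=0}^{N} f(x_n) x_{n+1}$; the step size `h` and the function names are arbitrary illustrative choices.

```python
import math


def trajectory(t, n_points):
    """x_0 = 0, x_1 = t, x_{n+1} = exp(x_n - x_{n-1}); returns [x_0, ..., x_{n_points}]."""
    xs = [0.0, t]
    while len(xs) <= n_points:
        xs.append(math.exp(xs[-1] - xs[-2]))
    return xs


def partial_cost(t, N):
    """E_N = sum_{n=0}^{N} f(x_n) x_{n+1} with f(x) = exp(-x)."""
    xs = trajectory(t, N + 1)
    return sum(math.exp(-xs[n]) * xs[n + 1] for n in range(N + 1))


def derivative_identity(t, N, h=1e-6):
    """Central-difference values of dE_N/dt and of f(x_N) * dx_{N+1}/dt."""
    lhs = (partial_cost(t + h, N) - partial_cost(t - h, N)) / (2.0 * h)
    x_plus = trajectory(t + h, N + 1)[N + 1]
    x_minus = trajectory(t - h, N + 1)[N + 1]
    x_N = trajectory(t, N + 1)[N]
    rhs = math.exp(-x_N) * (x_plus - x_minus) / (2.0 * h)
    return lhs, rhs
```

For moderate $N$ the two returned numbers should agree up to discretization error; for large $N$ the trajectory grows too quickly for a naive finite difference to remain accurate.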

Therefore, the cost function can only achieve minimum at one of the points of intersection 
of the separatrix with the line of initial conditions. 

7.3. Simulations and optimal trajectories. In this section we present results of nu- 
merical computation of the cost function for the one-sided search problem. We also explain 
how our theory fits with these observations. 

Figure [6] shows the plots of the cost of the trajectories for the exponentially dis- 
tributed H, evaluated at both chaotic and monotone trajectories. The simulation was 
stopped either when the trajectories increased beyond some large threshold, or after a 
fixed number of steps (the former trigger would correspond to monotone trajectories; the 
latter to chaotic ones). 

The monotonicity of the cost over the left and right intervals is apparent. The separatrix
$S$ intersects the ray of initial conditions $r$ at two points, $x_+ \approx 0.7465\ldots$ and $x_- \approx 0.1954\ldots$
(compare with Figure 4). Between the points, the initial conditions are in the chaotic
region. The monotonicity of $E$ outside of the chaotic region means that one of the two
initial values, $x_+$ or $x_-$, should generate the optimal trajectory. Numerically, $x_+$ wins:
$E(x_+) \approx 2.3645 < E(x_-) \approx 2.3861$.
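
The two quoted costs can be reproduced with a few lines of Python. The sketch below assumes $f(x) = e^{-x}$, the recursion $x_{n+1} = \exp(x_n - x_{n-1})$ with $x_0 = 0$, $x_1 = t$, and the cost $\sum_{n \ge 0} f(x_n) x_{n+1}$. Its stopping rule (first non-increasing step, or a threshold, or a step cap) is a simplification chosen for this illustration and is not the rule used for the figure; the name `truncated_cost` is hypothetical.

```python
import math


def truncated_cost(t, max_steps=60, threshold=50.0):
    """Cost of the trajectory started at x_0 = 0, x_1 = t, truncated as described above."""
    x_prev, x_cur = 0.0, t
    cost = t  # n = 0 term: f(x_0) * x_1 with f(0) = 1
    for _ in range(max_steps):
        x_next = math.exp(x_cur - x_prev)
        cost += math.exp(-x_cur) * x_next
        # Stop once the trajectory stops increasing or has clearly escaped.
        if x_next <= x_cur or x_next > threshold:
            break
        x_prev, x_cur = x_cur, x_next
    return cost
```

Because the separatrix is unstable, a starting value known to four digits leaves it after a handful of steps; the neglected terms are then already far below the quoted precision, so `truncated_cost(0.7465)` and `truncated_cost(0.1954)` can be compared directly with the values above.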

8. Conclusion 

We developed a geometric approach to the Linear Search Problem via discrete time 
Hamiltonian dynamics, which explains some of the hidden structure of the cost function. 
The rapid decay of the tail distribution function translates into hyperbolicity of the under- 
lying Hamiltonian dynamics. The latter is defined by the variational recursion which plays 
a key role in the characteristics of the optimal search trajectory. In particular, hyperbolicity
implies the existence of a separatrix which divides the regular and chaotic regions, and
the optimal search trajectory needs to start on the separatrix: the chaotic region cannot
contain optimal orbits, while in the regular region the orbits farther away from the separatrix
have higher cost (monotonicity of the cost function).

Figure 6. Numerically evaluated cost function $E(x)$ for exponentially distributed
$H$. Right display shows also results for chaotic region (stopped
after a fixed number of iterations). Left display is a magnification of the
right one, showing only the results over the region of monotonicity.

While this scenario is proved in this note only for a specific case of exponential tail 
distribution function, we anticipate that for other distributions with sufficiently fast decay, 
the same type of results, including the existence of separatrix and monotonicity of cost 
function in the region of monotonicity, will hold. Some of this hope is supported by partial 
results, see the appendix. 

We plan to return to these more general classes of distributions in a follow-up paper,
where we also plan to address the phenomenon of separatrix slow-down (the growth of
trajectories on the separatrix is slower than that in the interior of the region of monotonicity).

There are other open questions arising in the context of Hamiltonian dynamics based 
approach to the search problem. Extending the set of analyzed distributions to those with 
bounded support is a natural task. 

We also expect that in the search on rays, where the corresponding Hamiltonian map is
higher dimensional, hyperbolicity will also play an important role and a higher dimensional
separatrix (unstable manifold) can be found. It is expected that the optimal search plan would
still be restricted to the unstable manifold.

Appendix A. Series expansions

The expansion near $x = \infty$ for the separatrix given by

\[ \phi(x - \phi(x)) = \ln(x) \]

leads to the logarithmic series

\[ \phi(x) = \sum_{n=0}^{\infty} \frac{Q_n(\ln(x))}{x^n}. \]



\ + \Hx)-\^ 2 (x)\ ... 



The first three terms are given by

\[ \phi(x) = \ln(x) + \frac{\ln(x)}{x} + \frac{Q_2(\ln(x))}{x^2} + \dots \]
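
As a quick sanity check of the leading terms, one can measure how well a truncated expansion satisfies the functional equation $\phi(x - \phi(x)) = \ln(x)$. The Python sketch below compares the one-term truncation $\ln x$ with the two-term truncation $\ln x + \ln(x)/x$; the function names are hypothetical and the test point is arbitrary.

```python
import math


def phi_one_term(x):
    """One-term truncation of the expansion."""
    return math.log(x)


def phi_two_terms(x):
    """Two-term truncation of the expansion."""
    return math.log(x) + math.log(x) / x


def residual(phi, x):
    """Defect of the functional equation phi(x - phi(x)) = ln(x)."""
    return phi(x - phi(x)) - math.log(x)
```

For large $x$ the defect of the two-term truncation should be smaller than that of the one-term truncation by roughly a factor of order $x / \ln x$, which is what one expects if the next correction is of relative order $x^{-2}$ up to powers of $\ln x$.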

To justify this expansion, we need

Lemma 2. The equation $x = t(x) - \ln t(x)$ has a smooth solution for sufficiently large $x$:

\[ t(x) = x + \ln x + O\!\left(\frac{\ln x}{x}\right). \]

Proof. Let us write

\[ t(x) = x + \ln x + r(x) \]

and substitute in the equation. After some simplifications, we have

\[ r = \ln\left(1 + \frac{\ln x}{x} + \frac{r(x)}{x}\right). \]

Application of the contraction mapping principle to $r(x)$ gives the required error estimate.

□

Now, we prove

Proposition 11.

\[ \phi(x) = \ln x + O\!\left(\frac{\ln x}{x}\right). \]

Proof. Consider the first two iterations by $R^{-1}$ of $\phi_0 := (x = t,\ y = 0)$,

\[ \phi_1 := (x = t,\ y = \ln t), \qquad \phi_2 := (x = t - \ln t,\ y = \ln t). \]

They can be represented as graphs $y = \phi_1(x)$, $y = \phi_2(x)$ for sufficiently large $x$. Note that
$\phi_1(x) = \ln(x)$, while $\phi_2(x) = \ln t(x)$, where $x = t(x) - \ln t(x)$.
Now, using the above lemma we estimate

\[ |\phi_2(x) - \phi_1(x)| = |\ln t(x) - \ln x| = |\ln(x + \ln x + r(x)) - \ln x| = \left|\ln\left(1 + \frac{\ln x}{x} + \frac{r(x)}{x}\right)\right| \le C\,\frac{\ln x}{x}. \]

Applying the contraction mapping principle, we obtain the desired estimate

\[ |\phi(x) - \ln x| \le C\,\frac{\ln x}{x}. \]

□

Theorem 1. The mapping $R$ restricted to the separatrix takes the form

\[ x_{n+1} = x_n + \ln(x_n) + O(\ln(x_n)/x_n). \]

Proof. The separatrix is given by

\[ \phi(x) = \ln(x) + O\!\left(\frac{\ln(x)}{x}\right) \quad \text{for } x \to \infty. \]

Then, using the forward map representation $(x_{n+1}, y_{n+1}) = (\exp y_n,\ \exp y_n - x_n)$, we have

\[ \ln(x_{n+1}) + \rho(x_{n+1}) = x_{n+1} - x_n, \]

where $\rho(x) = O(\ln x / x)$ is a smooth function. Applying the implicit function theorem and
estimating the error term, we obtain the result.

□



Theorem 2. The asymptotics of the mapping restricted to the separatrix is given by

\[ x_n = n(\ln(n) + \ln(\ln(n))) + r_n, \]

where $r_n$ is a sequence satisfying

\[ |r_{n+1} - r_n| < C. \]

Proof. Substitute the expansion of $x_n$ in the recurrence relation

\[ x_{n+1} = x_n + \ln(x_n) + O(\ln(x_n)/x_n), \]

then after some cancellations, we obtain that $r_{n+1} = r_n + 1 + O(1)$, which implies the
result. □
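
The bounded-increment statement can be observed numerically on the leading-order recursion. The sketch below drops the $O(\ln(x_n)/x_n)$ correction, i.e. it iterates $x_{n+1} = x_n + \ln(x_n)$, which is an assumption of this illustration; the starting index, the starting value and the function names are arbitrary choices.

```python
import math


def leading_order_orbit(x_start=13.0, n_start=3, n_end=10000):
    """Iterate x_{n+1} = x_n + ln(x_n), dropping the O(ln(x_n)/x_n) correction."""
    xs = {n_start: x_start}
    for n in range(n_start, n_end):
        xs[n + 1] = xs[n] + math.log(xs[n])
    return xs


def remainder(xs, n):
    """r_n = x_n - n (ln n + ln ln n)."""
    return xs[n] - n * (math.log(n) + math.log(math.log(n)))


def max_increment(xs):
    """Largest |r_{n+1} - r_n| over the computed range."""
    ns = sorted(xs)
    return max(abs(remainder(xs, n + 1) - remainder(xs, n)) for n in ns[:-1])
```

In this model problem `max_increment(leading_order_orbit())` stays of order one while $x_n$ itself grows like $n \ln n$, consistent with the theorem.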

Appendix B. Two-sided Gaussian distribution: Beck-Bellman problem 

We consider the two-sided search on the real line with Gaussian probability distribution 
function as in the original Beck-Bellman problem and we show numerically that the same 
canonical structure persists: separatrix intersecting the curve of initial turning points. 

The difference relation obtained in [7] is given by

\[ (x_n + x_{n+1})\,\phi(x_n) = G(x_n) + G(x_{n-1}), \]

where

\[ \phi(t) = \frac{1}{\sqrt{2\pi}}\,e^{-t^2/2}, \qquad G(t) = \int_t^\infty \phi(s)\,ds. \]

The actual turning points are $(-1)^n x_n$, while $x_n > 0$. For matlab computations, we use

\[ \operatorname{erfc}(x) = \frac{2}{\sqrt{\pi}} \int_x^\infty e^{-t^2}\,dt \]

and the inverse function called erfcinv. Using the relation

\[ G(x) = \tfrac{1}{2}\operatorname{erfc}(x/\sqrt{2}), \]

the finite difference relation takes the form

\[ (x_{n+1} + x_n)\,\phi(x_n) = \tfrac{1}{2}\left(\operatorname{erfc}(x_n/\sqrt{2}) + \operatorname{erfc}(x_{n-1}/\sqrt{2})\right). \]

Now, using $y_{n+1} = x_{n+1} - x_n$, we have

\[ x_{n+1} = \frac{1}{2\phi(x_n)}\left(\operatorname{erfc}(x_n/\sqrt{2}) + \operatorname{erfc}((x_n - y_n)/\sqrt{2})\right) - x_n. \tag{13} \]

We will also use the inverse map which takes the form

\[ x_{n+1} = x_n - y_n, \]
\[ y_{n+1} = x_{n+1} - \sqrt{2}\,\operatorname{erfcinv}\!\left(2\phi(x_{n+1})(x_n + x_{n+1}) - \operatorname{erfc}(x_{n+1}/\sqrt{2})\right). \]

In this case, the initial data is given by the line segment $x_1 = y_1 = t$.

Figure 7. Invariant curve and iterated initial data in Beck's problem. The
long curve is the invariant manifold. The other two bent curves are the 1st and
2nd forward iterates of the initial data. The initial data itself is not present
because continuation of the separatrix in that region is computationally too
difficult.
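
For readers without matlab, the forward relation (13) and the inverse map can be sketched in Python using only the standard library. The sketch assumes that $\phi$ is the standard normal density; since the standard library provides `math.erfc` but no inverse, `erfcinv` below is a simple bisection written for this illustration, and all function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def density(t):
    """Standard normal density (assumed normalisation of phi)."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def erfcinv(p, lo=-6.0, hi=6.0):
    """Inverse of math.erfc on (0, 2) by bisection; erfc is decreasing."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.erfc(mid) > p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def forward(x, y):
    """(x_n, y_n) -> (x_{n+1}, y_{n+1}) following relation (13)."""
    s = math.erfc(x / SQRT2) + math.erfc((x - y) / SQRT2)
    x_next = s / (2.0 * density(x)) - x
    return x_next, x_next - x


def inverse(x, y):
    """Inverse map in the indexing of the text: the new x is x - y."""
    x_new = x - y
    arg = 2.0 * density(x_new) * (x + x_new) - math.erfc(x_new / SQRT2)
    return x_new, x_new - SQRT2 * erfcinv(arg)
```

Applying `inverse` to the output of `forward` should return the starting pair, which is a convenient consistency check before iterating either map.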

Appendix C. Gaussian tail distribution. One-sided search. 

In this section we verify that the contraction mapping principle can be used to establish
the existence of a separatrix for the one-sided search problem with Gaussian tail distribution.
In this case $f(x) = e^{-x^2}$, so that the second order difference relation is given by

\[ x_{n+1} = \frac{1}{2x_n}\,e^{x_n^2 - x_{n-1}^2}. \]

Let $y_{n+1} = x_{n+1}^2 - x_n^2$, then we have

\[ x_{n+1} = \frac{1}{2x_n}\,e^{y_n}, \qquad y_{n+1} = x_{n+1}^2 - x_n^2. \]

We will also need the inverse map

\[ x_n = \sqrt{x_{n+1}^2 - y_{n+1}}, \qquad y_n = \ln(2x_n x_{n+1}). \]

In this case, the initial data is given by a quadratic curve

\[ y = x^2 = t^2. \]
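
A minimal Python sketch of the forward and inverse maps for the one-sided Gaussian case is given below, assuming $f(x) = e^{-x^2}$ and the coordinate $y_n = x_n^2 - x_{n-1}^2$. The function names are hypothetical; the round trip is meant only as a consistency check of the two formulas.

```python
import math


def forward(x, y):
    """(x_n, y_n) -> (x_{n+1}, y_{n+1}) with y_n = x_n^2 - x_{n-1}^2."""
    x_next = math.exp(y) / (2.0 * x)
    return x_next, x_next * x_next - x * x


def inverse(x_next, y_next):
    """(x_{n+1}, y_{n+1}) -> (x_n, y_n), valid for x_n > 0."""
    x = math.sqrt(x_next * x_next - y_next)
    return x, math.log(2.0 * x * x_next)
```

Starting from a point $(t, t^2)$ on the curve of initial data, repeated calls to `forward` generate the trajectory of turning points in these coordinates.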

Now, we show that the contraction principle can be extended to Gaussian case. 

Theorem 2 (Unstable invariant manifold for one-sided Gaussian). There exists an invariant
manifold containing a graph $y = h(x)$ on $x \in [x_0, \infty)$ and with

\[ |h(x) - \ln(2x^2)| < 1. \]

Proof. Set up the contraction mapping

\[ \Phi(\phi)(x) = \ln(2 z_\phi(x)\,x), \]

where

\[ z_\phi^2(x) - \phi(z_\phi(x)) = x^2. \]

Let

\[ X = \{\phi \in C^1[x_0, \infty),\ \phi(x) > 0,\ 0 < \phi'(x) < 1/2\}. \]

By applying the same argument as in the exponential case, we can ensure that $\Phi$ leaves $X$
invariant if we take as the initial guess $\phi_0(x) = \ln(2x^2)$.
To establish contraction, consider

\[ |\Phi(\phi)(x) - \Phi(\psi)(x)| = |\ln(2x z_\phi(x)) - \ln(2x z_\psi(x))| = |\ln(z_\phi(x)) - \ln(z_\psi(x))| \le \frac{1}{\min(z_\phi(x), z_\psi(x))}\,|z_\phi(x) - z_\psi(x)|. \]

Using the identity

\[ z_\phi^2(x) - z_\psi^2(x) = \phi(z_\phi(x)) - \psi(z_\psi(x)) \]

and that $z_\phi(x) > x$, we have

\[ |z_\phi(x) - z_\psi(x)| \le \frac{1}{2x}\,|\phi(z_\phi(x)) - \psi(z_\psi(x))| \le \frac{1}{2x}\left(\|\phi - \psi\| + \|\psi'\| \cdot |z_\phi(x) - z_\psi(x)|\right). \]




Figure 8. Invariant curve and iterated initial data. The longer curve is the 
invariant manifold. Two other curves are iterated initial turning points. 

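Combining the terms, we have

\[ |z_\phi(x) - z_\psi(x)| \le \frac{\|\phi - \psi\|}{2x - \|\psi'\|} \]

and then

\[ |\Phi(\phi)(x) - \Phi(\psi)(x)| \le \frac{1}{x} \cdot \frac{1}{2x - \|\psi'\|}\,\|\phi - \psi\|. \]

Since we have assumed the bound $0 < \psi' < 1/2$, taking $x > 1$, we obtain contraction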

^(x) - ^)(x)\ <i||0-^||. 



□